Exponentially fast convergence to (strict) equilibrium via hedging
Abstract
Motivated by applications to data networks where fast convergence is essential, we analyze the problem of learning in generic N-person games that admit a Nash equilibrium in pure strategies. Specifically, we consider a scenario where players interact repeatedly and try to learn from past experience by small adjustments based on local – and possibly imperfect – payoff information. For concreteness, we focus on the so-called “hedge” variant of the exponential weights (EW) algorithm where players select an action with probability proportional to the exponential of the action’s cumulative payoff over time. When the players have perfect information on their mixed payoffs, the algorithm converges locally to a strict equilibrium and the rate of convergence is exponentially fast – of the order of $\mathcal{O}\big(\exp(-a \sum_{j=1}^{t} \gamma_j)\big)$, where $a > 0$ is a constant and $\gamma_j$ is the algorithm’s step-size. In the presence of uncertainty, convergence requires a more conservative step-size policy, but with high probability, the algorithm still achieves an exponential convergence rate.
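The hedge update described in the abstract can be illustrated with a minimal sketch for the perfect-information case. Everything specific here is an assumption made for illustration and is not taken from the paper: the two-player restriction, the 2x2 coordination game, the constant step-size, the function names `softmax` and `hedge_two_player`, and the reading that each payoff vector is weighted by the step-size $\gamma_j$ before being added to the cumulative score.

```python
import math


def softmax(scores):
    """Map cumulative scores to a mixed strategy (probabilities proportional to exp(score))."""
    top = max(scores)  # subtract the max for numerical stability
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]


def hedge_two_player(A, B, steps, gamma=lambda j: 0.5):
    """Run hedge for two players with perfect mixed-payoff information.

    A[i][k] is player 1's payoff and B[i][k] is player 2's payoff when
    player 1 plays action i and player 2 plays action k.
    gamma(j) is the step-size at stage j (hypothetical default: constant 0.5).
    Returns the list of (x1, x2) mixed strategies played at each stage.
    """
    n, m = len(A), len(A[0])
    y1 = [0.0] * n  # cumulative (step-size weighted) payoffs, player 1
    y2 = [0.0] * m  # cumulative (step-size weighted) payoffs, player 2
    history = []
    for j in range(1, steps + 1):
        x1, x2 = softmax(y1), softmax(y2)
        history.append((x1, x2))
        # Mixed payoff of each pure action against the opponent's current strategy.
        v1 = [sum(A[i][k] * x2[k] for k in range(m)) for i in range(n)]
        v2 = [sum(B[i][k] * x1[i] for i in range(n)) for k in range(m)]
        g = gamma(j)
        y1 = [y + g * v for y, v in zip(y1, v1)]
        y2 = [y + g * v for y, v in zip(y2, v2)]
    return history


# Illustrative coordination game: (0, 0) and (1, 1) are strict equilibria.
A = [[2.0, 0.0], [0.0, 1.0]]
B = [[2.0, 0.0], [0.0, 1.0]]
for stage, (x1, _) in enumerate(hedge_two_player(A, B, steps=10), start=1):
    # Probability mass that player 1 still puts off the equilibrium action.
    print(stage, x1[1])
```

In this toy run the probability that player 1 places on the non-equilibrium action shrinks at every stage, which is the qualitative behavior the abstract describes; the sketch does not attempt to reproduce the paper's constant $a$ or its imperfect-information step-size policy.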
Related articles
Fast and Slow Convergence to Equilibrium for Maxwellian Molecules via Wild Sums
We consider the spatially homogeneous Boltzmann equation for Maxwellian molecules and general finite energy initial data: positive Borel measures with finite moments up to order 2. We show that the coefficients in the Wild sum converge strongly to the equilibrium, and quantitatively estimate the rate. We show that this depends on the initial data F essentially only through the behavior near ...
Hedging Under Uncertainty: Regret Minimization Meets Exponentially Fast Convergence
This paper examines the problem of multi-agent learning in N-person non-cooperative games. For concreteness, we focus on the so-called “hedge” variant of the exponential weights (EW) algorithm, one of the most widely studied algorithmic schemes for regret minimization in online learning. In this multi-agent context, we show that a) dominated strategies become extinct (a.s.); and b) in generic g...
Learning strict Nash equilibria through reinforcement
This paper studies the analytical properties of the reinforcement learning model proposed in Erev and Roth (1998), also termed cumulative reinforcement learning in Laslier et al. (2001). The main results of the paper show that, if the solution trajectories of the underlying replicator equation converge exponentially fast, then, with probability arbitrarily close to one, all the pathwise realiza...
Effect of Exponentially-Varying Properties on Displacements and Stresses in Pressurized Functionally Graded Thick Spherical Shells with Using Iterative Technique
A semi-analytical iterative method as one of the newest analytical methods is used for the elastic analysis of thick-walled spherical pressure vessels made of functionally graded materials subjected to internal pressure. This method is accurate, fast and has a reasonable order of convergence. It is assumed that material properties except Poisson’s ratio are graded through the thickness directio...
Finite time convergence analysis for “Twisting” controller via a strict Lyapunov function
A second-order sliding mode controller, the so-called “Twisting” algorithm, is under study. A non-smooth strict Lyapunov function is proposed, so global finite time stability for this algorithm can be proved, even in the case when it is affected by bounded external perturbations. The strict Lyapunov function gives the possibility to estimate an upper bound for the time convergence of the...
Journal title:
- CoRR
Volume: abs/1607.08863
Pages: -
Publication date: 2016